AITopics | extrinsic reward

extrinsic reward

To facilitate the following derivation, we rewrite the objective $J^{\alpha}_{E+I}(\pi_{E+I}) - \alpha J_E(\pi_E)$: $J^{\alpha}_{E+I}(\pi_{E+I}) - \alpha J_E(\pi_E) = \mathbb{E}_{\pi_{E+I}}\Big[\sum^{\infty} \cdots$

Neural Information Processing Systems | Apr-25-2026, 01:03:39 GMT

A.1 Full derivation. We present the complete derivation of the objective function in each subproblem defined in Section 3.2. For brevity, let $r_t = (1+\alpha)\, r^E_t + r^I_t$ and $V^{\pi_E}_E(s_t) = V_t$. Under this assumption, $\pi_E$ serves as $\pi'$ (see above). This enables updating $\pi_{E+I}$ using the local approximation. We leave relaxing this assumption as future work.
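
The shorthand reward in the excerpt above can be illustrated with a small sketch. It assumes the weight on the extrinsic term is one plus a scalar multiplier, called `alpha` here; that symbol is not legible in the excerpt, so treat it as an assumption. The function names, the discount factor `gamma`, and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def combined_rewards(
    extrinsic: Sequence[float],
    intrinsic: Sequence[float],
    alpha: float,
) -> list[float]:
    """Per-step mixed reward r_t = (1 + alpha) * r^E_t + r^I_t.

    `alpha` is an assumed scalar multiplier (hypothetical name).
    """
    if len(extrinsic) != len(intrinsic):
        raise ValueError("extrinsic and intrinsic rewards must have equal length")
    return [(1.0 + alpha) * r_e + r_i for r_e, r_i in zip(extrinsic, intrinsic)]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of rewards; `gamma` is an illustrative discount factor."""
    total = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    mixed = combined_rewards([1.0, 0.0], [0.5, 0.5], alpha=1.0)
    print(mixed)
    print(discounted_return(mixed, gamma=0.5))
```

With `alpha = 0` the mixed reward reduces to the plain sum of extrinsic and intrinsic rewards, so in this sketch a larger `alpha` shifts weight toward the extrinsic term.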

artificial intelligence, machine learning, objective, (19 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

204fee94c982a19230c39045aa54f977-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 01:03:35 GMT

artificial intelligence, intrinsic reward, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.93)

Industry:

Government (0.68)
Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Robots (0.68)

Explore to Generalize in Zero-Shot RL

Neural Information Processing Systems | Feb-17-2026, 01:04:43 GMT

Recent developments in reinforcement learning (RL) led to algorithms that surpass human experts in a broad range of tasks [Mnih et al., 2015, Vinyals et al., 2019, Schrittwieser et al., 2020, Wurman et al.,

machine learning, natural language, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel (0.04)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Natural Language (0.82)

Discovering Creative Behaviors through DUPLEX: Diverse Universal Features for Policy Exploration

Neural Information Processing Systems | Feb-14-2026, 00:47:24 GMT

The ability to approach the same problem from different angles is a cornerstone of human intelligence that leads to robust solutions and effective adaptation to problem variations. In contrast, current RL methodologies tend to lead to policies that settle on a single solution to a given problem, making them brittle to problem variations. Replicating human flexibility in reinforcement learning agents is the challenge that we explore in this work.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

On Learning Intrinsic Rewards for Policy Gradient Methods

Zeyu Zheng, Junhyuk Oh, Satinder Singh

Neural Information Processing Systems | Feb-12-2026, 20:13:39 GMT

Whether it is possible to learn intrinsic reward functions for learning agents remains an open problem.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

c43b2989b1ba055aa713a4abbe4a8b05-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 19:37:20 GMT

exploration, international conference, intrinsic reward, (13 more...)

Neural Information Processing Systems

Country:

Europe > France (0.04)
Europe > Austria (0.04)
Asia > South Korea (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

abe8e03e3ac71c2ec3bfb0de042638d8-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 14:38:12 GMT

agent, exploration, exploration policy, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Multi-Objective Intrinsic Reward Learning for Conversational Recommender Systems

Neural Information Processing Systems | Feb-10-2026, 09:00:18 GMT

Conversational Recommender Systems (CRS) actively elicit user preferences to generate adaptive recommendations. Mainstream reinforcement learning-based CRS solutions heavily rely on handcrafted reward functions, which may not be aligned with user intent in CRS tasks.

artificial intelligence, machine learning, optimization problem, (16 more...)

Neural Information Processing Systems

Country: